KIISE Transactions on Computing Practices

ÇѱÛÁ¦¸ñ(Korean Title) ¹üÁÖ ºÒ±ÕÇü ºÐ·ù ¹®Á¦¸¦ À§ÇÑ µ¿Àû ºñ¿ë ¹Î°¨ ÇнÀ ¹æ¹ý
¿µ¹®Á¦¸ñ(English Title) Dynamic Cost Sensitive Learning for Imbalanced Text Classification
ÀúÀÚ(Author) ½Åâ¿í   ¿ÀÁø¿µ   Â÷Á¤¿ø   Chang-Uk Shin   Jinyoung Oh   Jeong-Won Cha  
¿ø¹®¼ö·Ïó(Citation) VOL 26 NO. 04 PP. 0211 ~ 0216 (2020. 04)
Çѱ۳»¿ë
(Korean Abstract)
ÇнÀ µ¥ÀÌÅͼ ³» ºÐ·ù ¹üÁÖ ºÒ±ÕÇüÀº ±× µ¥ÀÌÅͼÂÀ¸·Î ÇнÀµÈ ºÐ·ù ¸ðÇü¿¡ ÆíÇâÀ» ¾ß±âÇÑ´Ù. º» ¿¬±¸¿¡¼­´Â ÁÖ¾îÁø ¹üÁÖ ºÒ±ÕÇü µ¥ÀÌÅͼÂÀ» ÀÌ¿ëÇØ ºÐ·ù ¸ðÇüÀ» ÇнÀÇÏ´Â µÎ °¡Áö »õ·Î¿î ºñ¿ë ¹Î°¨ ÇнÀ ¹æ¹ýÀ» Á¦¾ÈÇÑ´Ù. ù ¹ø° ºñ¿ë ¹Î°¨ ÇнÀ ¹æ¹ýÀº ÇнÀ ÄÚÆÛ½º ³» ¹üÁÖº° ¹ß»ý ºóµµ¿Í µð¸®Å¬·¹ ºÐÆ÷¸¦ ÀÌ¿ëÇÑ´Ù. µ¿Àû °¡ÁßÄ¡ ºÎ¿© ¹æ¹ýÀ̶ó ¸í¸íÇÑ ÀÌ ¹æ¹ýÀº µð¸®Å¬·¹ ºÐÆ÷¿¡¼­ Ç¥º»À» ÃßÃâÇÏ¿© ¸ðµ¨ ÇнÀÀÇ °¡ÁßÄ¡·Î½á »ç¿ëÇÑ´Ù. µÎ ¹ø° ¹æ¹ýÀº ÇнÀ ÄÚÆÛ½º ³» ¹üÁÖº° ¹ß»ý ºóµµ·Î Á¤´ä Ç¥ÇöÀ» º¯°æÇÏ¿© ºñ¿ë ¹Î°¨ ÇнÀÀ» ¼öÇàÇÑ´Ù. ÀÌ ¹æ¹ýÀº ÆÛÁö Á¤´ä Ç¥ÇöÀ̶ó ¸í¸íÇÏ¿´´Ù. ´ëÈ­¿¡¼­ ¹ßÈ­ÀÇ °¨Á¤°ú È­ÇàÀ» ºÐ·ùÇÏ´Â ¹®Á¦¿¡ Á¦¾È ¹æ¹ýÀ» Àû¿ëÇÏ¿´À» ¶§, MAP(Macro Average Precision) ±âÁØ È­Çà ¾à 1.1¢¦2.2%p, °¨Á¤ ¾à 0.9¢¦3.6%p °¡·®ÀÇ ¼º´É Çâ»óÀ» ¾òÀ» ¼ö ÀÖ¾ú´Ù. ½ÇÇè °á°ú¸¦ ÅëÇØ, Á¦¾È ¹æ¹ýÀÌ ¹üÁÖ ºÒ±ÕÇü µ¥ÀÌÅͼÂÀÇ ÇнÀ¿¡ È¿°úÀûÀÓÀ» È®ÀÎÇÏ¿´´Ù
¿µ¹®³»¿ë
(English Abstract)
Classification category imbalance in training dataset causes bias in the classification model. In this paper, we propose two new cost-sensitive training methods for training classification models using a given category imbalanced dataset. The first proposed method uses the occurrence rate by category in the dataset and the Dirichlet distribution. This method, called the dynamic weighting method, takes a sample from the distribution and uses that as the weight of the loss function. The second proposed method performs training by changing the expression of the answer by the occurrence rate of each category in the training corpus. This method is called fuzzy answer representation. When applying the proposed method to classify emotions and speech acts in the dialogue, the performance improvement of approximately 1.1-2.2%p for speech act classification and 0.9-3.6%p for emotion based on MAP(Macro Average Precision) was obtained. The experimental results showed that the proposed method is effective for training the category imbalanced dataset.
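The abstract names the two ideas but does not give their formulas, so the following is only a minimal sketch of how they might look, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the choice of Dirichlet concentration proportional to inverse class frequency, the `scale` and `smoothing` parameters, and the way class frequencies are mixed into the one-hot targets.

```python
import numpy as np


def dynamic_class_weights(class_counts, rng, scale=10.0):
    """Draw per-class loss weights from a Dirichlet distribution.

    Assumed form (not from the paper): the concentration vector is
    proportional to inverse class frequency, so rare classes receive
    larger weights in expectation. A fresh sample can be drawn at every
    training step, which is what makes the weighting "dynamic".
    """
    counts = np.asarray(class_counts, dtype=float)
    inv_freq = counts.sum() / counts               # rarer class -> larger value
    alpha = scale * inv_freq / inv_freq.sum()      # hypothetical concentration
    sample = rng.dirichlet(alpha)                  # non-negative, sums to 1
    return sample * len(counts)                    # rescale: mean weight is 1


def fuzzy_targets(labels, class_counts, smoothing=0.1):
    """Soften one-hot targets using class frequencies.

    Assumed form (not from the paper): mix the one-hot vector with the
    empirical class distribution, similar in spirit to label smoothing.
    """
    counts = np.asarray(class_counts, dtype=float)
    prior = counts / counts.sum()
    one_hot = np.eye(len(counts))[np.asarray(labels)]
    return (1.0 - smoothing) * one_hot + smoothing * prior


def weighted_cross_entropy(probs, targets, weights):
    """Mean cross-entropy with per-class weights and possibly soft targets."""
    log_p = np.log(np.clip(probs, 1e-12, 1.0))
    return float(-(targets * weights * log_p).sum(axis=1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = [900, 90, 10]                         # toy imbalanced class counts
    weights = dynamic_class_weights(counts, rng)   # resample each step
    targets = fuzzy_targets([0, 2], counts)
    probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])
    print(weights, weighted_cross_entropy(probs, targets, weights))
```

In a training loop, `dynamic_class_weights` would be called once per batch so that the loss weights vary around a frequency-dependent mean, while `fuzzy_targets` would be applied once to the labels. Either piece can be used without the other.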
Keywords: text classification, category imbalanced classification, cost-sensitive learning, utterance speech-act classification, utterance emotion classification